{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "source": [
        "<font color=\"#a9a56c\" size=2> **@Author: Arif Kasim Rozani - (Team Operation Badar)** </font>"
      ],
      "metadata": {
        "id": "38_QpUXJOAII"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "### 🧪 What kinds of strings are automatically interned?\n",
        "\n",
        "Python tends to **auto-intern**:\n",
        "\n",
        "-   Short strings\n",
        "    \n",
        "-   Strings that look like valid identifiers (`\"hello\"`, `\"var123\"`, etc.)\n",
        "    \n",
        "-   Strings used frequently in code (like keywords or literals)"
      ],
      "metadata": {
        "id": "fnFrAd6OL7GH"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "But Python **avoids interning** strings that:\n",
        "\n",
        "-   Contain **whitespace**\n",
        "    \n",
        "-   Are not valid identifiers (e.g. `'a a'`, `'123 456'`, `'@#%!'`)\n",
        "    \n",
        "-   Are **long or dynamically created**\n",
        "    \n",
        "\n",
        "Why? Because:\n",
        "\n",
        "### 🧾Reason:\n",
        "\n",
        "> Python avoids interning strings like `'a a'` because **they’re less likely to be reused** and **more likely to be dynamically generated** (e.g., from user input, file reads, etc.). Interning them would waste memory and CPU cycles without much gain."
      ],
      "metadata": {
        "id": "EArO577zPtU1"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a b'\n",
        "b = 'a b'\n",
        "print(a is b)  # False"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "8pkRgJ9hP_7u",
        "outputId": "7db459e1-cc78-4f50-bbaf-2c830c5b4111"
      },
      "execution_count": 1,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "Even though `'a b'` is **short**, it's **not auto-interned** because it contains a **space**, and Python avoids interning strings that don’t look like identifiers.\n",
        "\n",
        "----------\n",
        "\n",
        "### 🚫 Why `'a b'` is not interned automatically:\n",
        "\n",
        "-   It’s short ✅\n",
        "    \n",
        "-   It’s immutable ✅\n",
        "    \n",
        "-   But it **doesn’t look like a variable name** ❌\n",
        "    \n",
        "\n",
        "> Python’s auto-interning is optimized for identifier-like strings: things like `'abc'`, `'var_1'`, or `'x123'`."
      ],
      "metadata": {
        "id": "XSr5nojqQIkT"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "\n",
        "## 🧠 String Interning in Python — The Rules\n",
        "\n",
        "Python **automatically** interns some strings **at compile time**, but only under specific conditions.\n",
        "\n",
        "Here’s what determines whether a string will be **auto-interned**:\n",
        "\n",
        "----------\n",
        "\n",
        "### ✅ **Auto-interned (usually)**\n",
        "\n",
        "1.  **Identifiers / Valid Variable Names**\n",
        "    \n",
        "    -   Strings like `'hello'`, `'var123'`, `'X_y'`\n",
        "        \n",
        "    -   Basically, anything that could be a Python variable name\n",
        "        \n",
        "    -   ✅ Interned by default\n",
        "        \n",
        "2.  **Short strings (usually < 20 characters)**\n",
        "    \n",
        "    -   Even if not identifiers, some very short strings might still be interned in CPython depending on usage.\n",
        "        \n",
        "    -   ✅ Often interned — **but not guaranteed**\n",
        "        \n",
        "3.  **Strings used as constants in code**\n",
        "    \n",
        "    -   For example, two occurrences of the same string literal in the same `.py` file will often be interned.\n",
        "        \n",
        "    -   ✅ Interned during compilation\n",
        "        \n",
        "4.  **Compile-time literals**\n",
        "    \n",
        "    -   Strings hardcoded in your code (not dynamically created).\n",
        "        \n",
        "    -   ✅ Interned by the compiler if it decides it’s worth it\n",
        "        \n",
        "\n",
        "----------\n",
        "\n",
        "### ❌ **Not auto-interned (unless forced)**\n",
        "\n",
        "1.  **Strings with spaces**\n",
        "    \n",
        "    -   `'hello world'`, `'a b'`\n",
        "        \n",
        "    -   ❌ Not valid identifiers, so not interned\n",
        "        \n",
        "2.  **Strings with special characters**\n",
        "    \n",
        "    -   `'@user'`, `'a+b'`, `'!error'`\n",
        "        \n",
        "    -   ❌ Not valid identifiers\n",
        "        \n",
        "3.  **Strings created dynamically (runtime)**\n",
        "    \n",
        "    -   From input, file reads, string concatenation, or string formatting\n",
        "        \n",
        "    -   ❌ Not interned automatically\n",
        "        \n",
        "4.  **Multiline or long strings**\n",
        "    \n",
        "    -   Even if content is repeated, they’re too big to intern efficiently\n",
        "        \n",
        "    -   ❌ Skipped to save memory"
      ],
      "metadata": {
        "id": "PA9omsP9QYHs"
      }
    },
    {
      "cell_type": "markdown",
      "source": [
        "## **Examples**"
      ],
      "metadata": {
        "id": "X7aAJZE_R55b"
      }
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a&b'\n",
        "b = 'a&b'\n",
        "print(\"a is b = \", a in b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "psY6rbH3RLdD",
        "outputId": "a6e2e202-7ee4-4dec-ce36-e885f0ae2eff"
      },
      "execution_count": 3,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  True\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a b'\n",
        "b = 'a b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "K-jmn-FWRboj",
        "outputId": "5046ba3b-6d5e-42d3-a045-348c368448b8"
      },
      "execution_count": 5,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a+b'\n",
        "b = 'a+b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "_6Ke4T_lRkxq",
        "outputId": "855e8404-503e-4634-97cd-0eb373326a67"
      },
      "execution_count": 6,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a\\nb'\n",
        "b = 'a\\nb'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "IIgW3KXNRm0z",
        "outputId": "2f41e72d-979f-4283-8c8b-401054b12739"
      },
      "execution_count": 7,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a\\tb'\n",
        "b = 'a\\tb'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "62NRZ4dsRpzs",
        "outputId": "d42623c4-f9a5-49cb-cc95-32af25b1b23b"
      },
      "execution_count": 8,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a=b'\n",
        "b = 'a=b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "ABWCO5CaRsS6",
        "outputId": "f44d7177-aca8-4333-e886-b232d0b1e205"
      },
      "execution_count": 9,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a#b'\n",
        "b = 'a#b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "qhOiSvwIRsBL",
        "outputId": "f2686f34-ec60-4eae-fdb3-3912a687d328"
      },
      "execution_count": 10,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a-b'\n",
        "b = 'a-b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "qXJDRasiSASy",
        "outputId": "621bb59a-1934-40eb-cc71-7b3dbe07ad23"
      },
      "execution_count": 11,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a/b'\n",
        "b = 'a/b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "_o3Ggt7pSCUS",
        "outputId": "26e80169-bb97-4184-aa09-92f572b2bc2f"
      },
      "execution_count": 12,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a  = 'a^b'\n",
        "b  = 'a^b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Wd3axsqOSD-6",
        "outputId": "df08033a-ab20-46ed-d5ff-cf7bc3914eea"
      },
      "execution_count": 13,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a%b'\n",
        "b = 'a%b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "hfm-I9Z0SG0L",
        "outputId": "bf3eed18-c70e-453b-f6fd-f72ff7810a12"
      },
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a$b'\n",
        "b = 'a$b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "DOm0afBSSIcq",
        "outputId": "4448d751-c154-4ed9-df92-74f4a3bda07a"
      },
      "execution_count": 16,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  False\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = 'a_b'\n",
        "b = 'a_b'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Qwg6eOimSaiy",
        "outputId": "0cc9afa1-b808-4865-ec0d-fb1fe45fd442"
      },
      "execution_count": 17,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  True\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = '5j'\n",
        "b = '5j'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Ti3j_nd2Seoi",
        "outputId": "02680b0a-294b-4304-c719-d26284df782a"
      },
      "execution_count": 18,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  True\n"
          ]
        }
      ]
    },
    {
      "cell_type": "code",
      "source": [
        "a = '\\u0041' # A\n",
        "b = '\\u0041'\n",
        "print(\"a is b = \", a is b)"
      ],
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "6z-8WDeHSryB",
        "outputId": "0384db91-9fee-4bc9-9675-f42461f29eaa"
      },
      "execution_count": 28,
      "outputs": [
        {
          "output_type": "stream",
          "name": "stdout",
          "text": [
            "a is b =  True\n"
          ]
        }
      ]
    }
  ]
}